Plugging Anti and Output Dependence Removal Techniques into Loop Parallelization Algorithms
Ecole Normale Supérieure De Lyon
Authors
Abstract
In this paper we briefly survey some loop transformation techniques which break anti or output dependences, or artificial cycles involving such "false" dependences. These false dependences are removed through the introduction of temporary buffer arrays. Next we show how to plug these techniques into loop parallelization algorithms (such as Allen and Kennedy's algorithm). The goal is to extract as many parallel loops as the intrinsic degree of parallelism of the nest allows, while avoiding a full memory expansion. We try to reduce the number of temporary arrays that we introduce, as well as their dimension.
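As a minimal sketch of the idea of removing a false dependence with a temporary buffer array (not the paper's own algorithm), consider a loop in which iteration i reads `a[i + 1]` before iteration i + 1 overwrites it. The function names, the array names, and the `order` parameter below are assumptions chosen for illustration. Copying the old values into a buffer removes the anti dependence, so the iterations can then run in any order.

```python
def sequential(a, b):
    """Original loop: iteration i reads a[i + 1], which iteration i + 1
    later overwrites. This is an anti dependence from i to i + 1."""
    a = list(a)
    for i in range(len(a) - 1):
        a[i] = a[i + 1] + b[i]
    return a


def without_buffer(a, b, order):
    """Same loop body executed in an arbitrary order, with no buffer.
    Reordering may violate the anti dependence and change the result."""
    a = list(a)
    for i in order:
        a[i] = a[i + 1] + b[i]
    return a


def with_buffer(a, b, order):
    """Anti dependence removed: all reads go to a temporary copy (tmp),
    so no iteration can overwrite a value another iteration still needs."""
    a = list(a)
    tmp = list(a)  # temporary buffer array holding the old values of a
    for i in order:  # iterations are now independent of each other
        a[i] = tmp[i + 1] + b[i]
    return a


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [10, 20, 30, 40]
    reverse = [2, 1, 0]
    print(sequential(a, b))
    print(without_buffer(a, b, reverse))
    print(with_buffer(a, b, reverse))
```

Here the buffer has the same size as the original array. The abstract's stated goal of limiting the number and dimension of temporary arrays concerns larger nests, where a naive copy of this kind would amount to a full memory expansion.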
Similar resources
Plugging Anti and Output Dependence Removal Techniques into Loop Parallelization Algorithms
In this paper we briefly survey some loop transformation techniques which break anti or output dependences, or artificial cycles involving such "false" dependences. These false dependences are removed through the introduction of temporary buffer arrays. Next we show how to plug these techniques into loop parallelization algorithms (such as Allen and Kennedy's algorithm). The goal is to extract as...
Plugging Anti and Output Dependence Removal Techniques Into Loop Parallelization Algorithm
In this paper we briefly survey some loop transformation techniques which break anti or output dependences, or artificial cycles involving such "false" dependences. These false dependences are removed through the introduction of temporary buffer arrays. Next we show how to plug these techniques into loop parallelization algorithms (such as Allen and Kennedy's algorithm). The goal is to extract a...
Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling (Ecole Normale Supérieure De Lyon)
Tiling is a technique used for exploiting medium-grain parallelism in nested loops. It relies on a first step that detects sets of permutable nested loops. All algorithms developed so far consider the statements of the loop body as a single block; in other words, they are not able to take advantage of the structure of dependences between different statements. In this report, we overcome this limitation b...
The Bouclettes Loop Parallelizer (Ecole Normale Supérieure De Lyon)
Bouclettes is a source-to-source loop nest parallelizer. It takes as input Fortran uniform, perfectly nested loops and gives as output an HPF (High Performance Fortran) program with data distribution and parallel (HPF INDEPENDENT) loops. This paper presents the tool and the underlying parallelization methodology.
Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling
Tiling is a technique used for exploiting medium-grain parallelism in nested loops. It relies on a first step that detects sets of permutable nested loops. All algorithms developed so far consider the statements of the loop body as a single block; in other words, they are not able to take advantage of the structure of dependences between different statements. In this paper, we overcome this limita...